notebooks: Quickstart for model documentation edit by validbeck · Pull Request #372 · validmind/validmind-library

validbeck · 2025-05-14T18:20:53Z

Pull Request Description

sc-7911

What

As part of our initiative to clean up and fortify our Jupyter Notebooks, the model documentation quickstart has been cleaned up and given more context for beginner users.
There is now a new "quickstart" directory in notebooks/ and an updated README to accommodate it.

Why

Our notebooks really need some TLC, and this is the first stepping stone. Cleaning up this notebook also allows us to build a complementary "Quickstart for model validation" next.

How to Test

Pull down this PR: gh pr checkout 372
Open notebooks/quickstart/quickstart_model_documentation.ipynb to review and run.

Pull Request Dependencies

Changes to the notebooks were also pulled into:

documentation repo — notebooks: Pulling in new Quickstart for model documentation documentation#720
demo-environment repo — https://github.com/validmind/demo-environment/pull/19

External Release Notes

Want to get started with documenting models with the ValidMind Library? Check out our updated Quickstart for model documentation notebook:

Learn the basics of using ValidMind to document models as part of a model development workflow.
Set up the ValidMind Library in your environment, and generate a draft of documentation using ValidMind tests for a binary classification model.

Deployment Notes

Refer to the above section "Pull Request Dependencies."

Breaking Changes

Note

This gets rid of the old notebooks/code_samples/quickstart_customer_churn_full_suite.ipynb file, as the new file and directory replace it.

Links have been fixed in both validmind-library and documentation in the two PRs above.

Screenshots/Videos (Frontend Only)

n/a

Checklist

Areas Needing Special Review

I expanded and broke down the following sections because the original was so compressed that it was hard to understand why we were performing those tasks. Since I am not a model developer or model expert, someone should double-check that the explanations provided are accurate and relevant for the following:

Preprocessing the raw dataset
Training an XGBoost classifier model

Additional Notes

n/a

LoiAnsah

Review for "Preprocessing the Raw Dataset":

Note: I tried to quote specific sections and suggest an alternative using "->".

For "Split the dataset":

"Next...ValidMind" -> Before running tests with ValidMind, we will need to preprocess the dataset. This involves splitting the data and separating the features (inputs) from the targets (outputs).

"Use preprocess()... parts" -> Use preprocess() to split our dataset into three subsets:
"train_df...model." -> Used to train the model ("train" because it is the standard term in ML).
"validation_df...trained" -> Used to evaluate the model's performance during training.
"test_df...data" -> Used later on to assess the model's performance on new, unseen data.
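
As an illustration of the three-way split described above, here is a minimal, library-free sketch. It is not the notebook's preprocess() implementation: the helper name `split_three_ways`, the 60/20/20 proportions, and the fixed seed are all assumptions chosen for the example.

```python
import random


def split_three_ways(rows, train_frac=0.6, validation_frac=0.2, seed=42):
    """Shuffle rows and split them into train, validation, and test subsets.

    Hypothetical helper for illustration only; the fractions and seed are
    assumptions, not values taken from the notebook.
    """
    if not 0 < train_frac + validation_frac < 1:
        raise ValueError("train_frac + validation_frac must be between 0 and 1")

    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility

    train_end = round(len(shuffled) * train_frac)
    validation_end = train_end + round(len(shuffled) * validation_frac)

    train = shuffled[:train_end]                     # used to fit the model
    validation = shuffled[train_end:validation_end]  # used to monitor training
    test = shuffled[validation_end:]                 # held out for final evaluation
    return train, validation, test


rows = list(range(100))  # stand-in for dataset rows
train_rows, validation_rows, test_rows = split_three_ways(rows)
print(len(train_rows), len(validation_rows), len(test_rows))
```

Keeping the test subset untouched until the end is what makes it a fair estimate of performance on new, unseen data.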

For "Separate features and targets":

My suggestion:

To train the model, we need to provide it with:

Inputs - ....
Outputs (Expected answers/labels) - in our case, we would like to know whether the customer churned or not

Note: I believe there is a "to" missing before "hold".
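
To make the inputs/outputs distinction concrete, here is a small sketch that separates features from the target. The column name `churned` and the sample rows are hypothetical, made up for this example; the notebook's dataset uses its own column names.

```python
def separate_features_and_target(rows, target_column):
    """Return (features, targets) from a list of row dictionaries."""
    features = [
        {name: value for name, value in row.items() if name != target_column}
        for row in rows
    ]
    targets = [row[target_column] for row in rows]
    return features, targets


# Hypothetical sample rows; "churned" is an assumed name for the target column.
sample_rows = [
    {"age": 42, "balance": 1200.0, "churned": 1},
    {"age": 35, "balance": 560.5, "churned": 0},
]

x_train, y_train = separate_features_and_target(sample_rows, "churned")
print(x_train)  # inputs: everything except the target column
print(y_train)  # outputs: the labels the model learns to predict
```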

Review for "Training an XGBoost classifier model":

error - Measures how....
logloss - Indicates how...
auc - Evaluates how...

Note: I simply added action verbs.
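
For readers who want to see what these three metrics compute, here is a plain-Python sketch of each for a binary classifier. These are the standard textbook definitions (error rate at a 0.5 threshold, mean negative log-likelihood, and rank-based AUC), written for illustration; they are not taken from XGBoost's source, and the function names and sample values are assumptions.

```python
import math


def error_rate(y_true, y_prob, threshold=0.5):
    """Fraction of predictions that land on the wrong side of the threshold."""
    wrong = sum((p > threshold) != bool(y) for y, p in zip(y_true, y_prob))
    return wrong / len(y_true)


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood; confident wrong predictions cost the most."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative."""
    positives = [p for y, p in zip(y_true, y_prob) if y == 1]
    negatives = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if pos > neg else 0.5 if pos == neg else 0.0
        for pos in positives
        for neg in negatives
    )
    return wins / (len(positives) * len(negatives))


# Made-up labels and predicted probabilities for illustration.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.6]
print(error_rate(y_true, y_prob), log_loss(y_true, y_prob), auc(y_true, y_prob))
```

Lower is better for error and logloss, while higher is better for auc.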

validbeck · 2025-05-14T22:12:51Z

@LoiAnsah These are excellent suggestions. May I suggest you make them official? ;)

GitHub: About reviewing pull requests (EDIT: Oops, forgot the link!)

Optionally, to suggest a specific change to the line or lines, click (see image), then edit the text within the suggestion block.

You may run into something interesting when you look at the .ipynb file online — the Jupyter Notebooks primer I wrote that's available under the intern guides may explain some of the oddness. Give it a try anyhow!

LoiAnsah · 2025-05-14T22:38:46Z

@validbeck Will make sure to add them!

CLAassistant · 2025-05-15T13:59:18Z

All committers have signed the CLA.

validbeck · 2025-05-15T16:45:20Z

@LoiAnsah Pushing up a commit is one way you can suggest changes, good job figuring it out! But I actually wanted you to try this feature, as I wanted to make sure you understood how to use it (and this way, the person owning the PR gets to decide whether or not to apply the changes):

Optionally, to suggest a specific change to the line or lines, click (see image), then edit the text within the suggestion block.

I'm going to revert the PR to the previous commit, so you can try the "suggestion" feature again. :)

notebooks/quickstart/quickstart_model_documentation.ipynb

LoiAnsah

I added my suggestions :)

notebooks/quickstart/quickstart_model_documentation.ipynb

github-actions · 2025-05-16T16:21:52Z

PR Summary

This pull request refactors the organization of Jupyter notebooks within the project, specifically focusing on the quickstart guide for model documentation using ValidMind. The changes include:

Reorganization of Notebooks: The quickstart_customer_churn_full_suite.ipynb notebook has been removed and its content has been relocated to a new notebook named quickstart_model_documentation.ipynb under the notebooks/quickstart directory. This change aims to improve the logical organization of the notebooks by grouping quickstart guides together.
Updates to Documentation References: References within the notebooks have been updated to reflect the new location of the quickstart guide. This includes updates in markdown cells and code comments to ensure that users are directed to the correct resources.
Script Adjustments: The run_e2e_notebooks.py script has been updated to reflect the new path of the quickstart notebook, ensuring that the end-to-end tests continue to function correctly with the relocated notebook.
Minor Documentation Enhancements: Some markdown cells have been enhanced with additional explanations and links to external resources, such as the Pandas DataFrame documentation, to provide users with more context and learning resources.

These changes are intended to enhance the usability and maintainability of the project by improving the organization and clarity of the documentation resources.

Test Suggestions

Run the run_e2e_notebooks.py script to ensure all notebooks execute without errors.
Verify that all links and references within the notebooks point to the correct locations after the reorganization.
Check that the new quickstart_model_documentation.ipynb notebook functions as expected and produces the correct outputs.
Ensure that the documentation enhancements provide clear and accurate information to users.

github-actions · 2025-05-16T16:22:40Z

PR Summary

This pull request refactors the structure of the Jupyter notebooks used in the ValidMind project. The primary change involves relocating the quickstart_customer_churn_full_suite.ipynb notebook to a new location and renaming it to quickstart_model_documentation.ipynb. This change is reflected in the scripts/run_e2e_notebooks.py file, which now points to the new location of the notebook. Additionally, the README and other documentation files have been updated to reflect this change.

The PR also includes minor updates to the documentation within the notebooks, such as clarifying the description of a Pandas DataFrame and ensuring consistent terminology (e.g., changing 'test' to 'testing' datasets). These changes aim to improve the clarity and usability of the documentation for users.

Overall, this PR enhances the organization and readability of the project documentation, making it easier for users to follow the quickstart guide for model documentation using ValidMind.

Test Suggestions

Run the relocated quickstart_model_documentation.ipynb notebook to ensure it executes without errors.
Verify that the scripts/run_e2e_notebooks.py script correctly identifies and runs the relocated notebook.
Check all links and references in the documentation to ensure they point to the correct notebook locations.
Review the updated documentation for clarity and accuracy.

github-actions · 2025-05-16T16:22:40Z

PR Summary

This pull request refactors the structure of the Jupyter notebooks used in the project, specifically focusing on the quickstart guide for model documentation using ValidMind. The main changes include:

Relocation and Renaming: The quickstart_customer_churn_full_suite.ipynb notebook has been removed and replaced with a new notebook quickstart_model_documentation.ipynb located in the notebooks/quickstart directory. This change aims to better organize the notebooks and make the quickstart guide more accessible.
Content Updates: The new quickstart_model_documentation.ipynb notebook includes updated content and structure to guide users through the process of documenting models using ValidMind. It covers importing datasets, initializing the ValidMind library, setting up the environment, and running a full suite of tests.
Documentation and Links: The notebook now includes more detailed explanations and links to relevant documentation, making it easier for users to understand the steps involved in model documentation.
Script Update: The run_e2e_notebooks.py script has been updated to reflect the new path of the quickstart notebook, ensuring that the end-to-end tests are executed on the correct files.
Minor Textual Changes: Some minor textual changes have been made across various notebooks to improve clarity and consistency, such as updating references to Pandas DataFrame and correcting terminology.

These changes aim to improve the usability and organization of the project’s documentation resources, making it easier for users to get started with ValidMind.

Test Suggestions

Run the updated quickstart_model_documentation.ipynb notebook to ensure all steps execute without errors.
Verify that the links to external resources and documentation within the notebook are correct and accessible.
Check that the run_e2e_notebooks.py script correctly executes the relocated quickstart notebook.
Ensure that the new notebook structure and content are clear and provide a comprehensive guide for new users.
Test the notebook in different Python environments to ensure compatibility, especially with the recommended Python versions.

validbeck · 2025-05-16T16:31:03Z

@LoiAnsah Thank you for the detailed suggestions! Next, you want to double check that the new changes look good, then press the big ol' "Approve" button:

Approving a pull request with required reviews

LoiAnsah

Looks great to me!

validbeck · 2025-05-16T22:53:10Z

@LoiAnsah Thank you for helping with reviewing this PR — you did awesome!

github-actions · 2025-05-21T17:11:09Z

PR Summary

This pull request refactors the structure of the Jupyter notebooks used for demonstrating the ValidMind library. The main changes include:

Relocation and Renaming: The quickstart_customer_churn_full_suite.ipynb notebook has been removed and replaced with a new notebook quickstart_model_documentation.ipynb located in the notebooks/quickstart directory. This change aims to better organize the quickstart guides and improve clarity.
Documentation Updates: The README and other markdown cells within the notebooks have been updated to reflect the new structure and provide clearer instructions. This includes updating links and descriptions to ensure consistency with the new notebook structure.
Code and Text Enhancements: Minor text edits have been made across various notebooks to improve clarity and consistency, such as adding more detailed descriptions of Pandas DataFrames and ensuring consistent terminology (e.g., 'testing datasets' instead of 'test datasets').
Script Update: The run_e2e_notebooks.py script has been updated to reflect the new path of the quickstart notebook, ensuring that the end-to-end tests run the correct files.

Test Suggestions

Run the quickstart_model_documentation.ipynb notebook to ensure it executes without errors and produces the expected outputs.
Verify that all links in the updated README and notebooks point to the correct resources.
Check the run_e2e_notebooks.py script to ensure it correctly executes the relocated notebook.
Review the markdown content for clarity and accuracy in the context of the new notebook structure.

validbeck added 10 commits May 13, 2025 15:33

New documentation quickstart file

93025bc

Making a copy of old notebook to reference

878e3ab

Save point

3bd5d10

wip

d1fc376

Edits

e917246

Moving notebook

98edc81

Quickstart dir & link patch

252aa09

Editing

a9ae24c

Edit

838b486

Edits

3c244d6

validbeck self-assigned this May 14, 2025

validbeck added the enhancement New feature or request label May 14, 2025

README

191b90a

validbeck mentioned this pull request May 14, 2025

notebooks: Pulling in new Quickstart for model documentation validmind/documentation#720

Merged

validbeck requested review from cachafla and nrichers May 14, 2025 18:42

validbeck marked this pull request as ready for review May 14, 2025 19:54

validbeck requested a review from LoiAnsah May 14, 2025 20:04

LoiAnsah reviewed May 14, 2025

validbeck requested a review from LoiAnsah May 14, 2025 22:20

validbeck force-pushed the beck/sc-7911/edit-code-samples-notebooks-quickstart-for branch from 7b59545 to 191b90a May 15, 2025 16:46

LoiAnsah reviewed May 15, 2025

notebooks/quickstart/quickstart_model_documentation.ipynb

LoiAnsah reviewed May 15, 2025

notebooks/quickstart/quickstart_model_documentation.ipynb

LoiAnsah reviewed May 15, 2025

validbeck and others added 2 commits May 16, 2025 09:20

Update notebooks/quickstart/quickstart_model_documentation.ipynb

b815772

Co-authored-by: Lois Ansah <133300328+LoiAnsah@users.noreply.github.com>

Update notebooks/quickstart/quickstart_model_documentation.ipynb

10be37d

Co-authored-by: Lois Ansah <133300328+LoiAnsah@users.noreply.github.com>

validbeck and others added 4 commits May 16, 2025 09:21

Update notebooks/quickstart/quickstart_model_documentation.ipynb

6e0be1c

Co-authored-by: Lois Ansah <133300328+LoiAnsah@users.noreply.github.com>

Update notebooks/quickstart/quickstart_model_documentation.ipynb

fe01c6a

Co-authored-by: Lois Ansah <133300328+LoiAnsah@users.noreply.github.com>

Update notebooks/quickstart/quickstart_model_documentation.ipynb

e29d24d

Co-authored-by: Lois Ansah <133300328+LoiAnsah@users.noreply.github.com>

Update notebooks/quickstart/quickstart_model_documentation.ipynb

38329f4

Co-authored-by: Lois Ansah <133300328+LoiAnsah@users.noreply.github.com>

validbeck and others added 5 commits May 16, 2025 09:21

Update notebooks/quickstart/quickstart_model_documentation.ipynb

cc07056

Co-authored-by: Lois Ansah <133300328+LoiAnsah@users.noreply.github.com>

Update notebooks/quickstart/quickstart_model_documentation.ipynb

f64fa2a

Co-authored-by: Lois Ansah <133300328+LoiAnsah@users.noreply.github.com>

Update notebooks/quickstart/quickstart_model_documentation.ipynb

e36c0a9

Co-authored-by: Lois Ansah <133300328+LoiAnsah@users.noreply.github.com>

Update notebooks/quickstart/quickstart_model_documentation.ipynb

2afccb6

Co-authored-by: Lois Ansah <133300328+LoiAnsah@users.noreply.github.com>

Update notebooks/quickstart/quickstart_model_documentation.ipynb

4140a35

Co-authored-by: Lois Ansah <133300328+LoiAnsah@users.noreply.github.com>

validbeck and others added 4 commits May 16, 2025 09:24

Update notebooks/quickstart/quickstart_model_documentation.ipynb

9734152

Co-authored-by: Lois Ansah <133300328+LoiAnsah@users.noreply.github.com>

Proofreading

d4c3cd1

Merge remote into local

be1453c

More proofreading...

a3b47dc

validbeck requested a review from LoiAnsah May 16, 2025 16:30

LoiAnsah approved these changes May 16, 2025

validbeck added 2 commits May 16, 2025 15:53

2.8.24

36da3ce

Resolving merge conflicts

1fb0f0e

validbeck merged commit bb08dc2 into main May 21, 2025
6 checks passed

validbeck deleted the beck/sc-7911/edit-code-samples-notebooks-quickstart-for branch May 21, 2025 17:12
